Speaker Independent Speech Recognition Using Features Based on Glottal Sound Source

Authors

  • Norihide Kitaoka
  • Daisuke Yamada
  • Seiichi Nakagawa
Abstract

We discuss the use of features based on the glottal sound source for speaker-independent speech recognition. Features such as pitch have been thought unable to contribute to speaker-independent speech recognition because they are dominated by speaker-dependent factors. In this paper, we use pitch, power, LPC residual power, voicing rate, and their regression coefficients as feature parameters for speaker-independent speech recognition. We find that the regression parameters of F0, power, and LPC residual power improve performance, especially when the covariances between each parameter and conventional MFCC are used. This shows that deriving the regression parameters reduces the speaker-dependent factor, which appears as a bias in those features, and that the correlation between glottal source information and spectral envelope information (MFCC) is useful. We also tested the parameters on a large-vocabulary continuous speech recognition task and obtained a performance improvement.
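The claim that regression parameters suppress a speaker-dependent bias can be illustrated with a small sketch. The abstract does not give the regression window or the exact formula the authors used, so the code below assumes the common least-squares slope over a window of 2K+1 frames with edge frames replicated. The function name `regression_coefficients`, the window size, and the log-F0 values are all hypothetical and chosen only for illustration.

```python
def regression_coefficients(track, half_window=2):
    """Least-squares slope of `track` over a sliding window of
    2 * half_window + 1 frames (an assumed, commonly used delta formula)."""
    n = len(track)
    denom = 2 * sum(k * k for k in range(1, half_window + 1))
    deltas = []
    for t in range(n):
        num = 0.0
        for k in range(1, half_window + 1):
            ahead = track[min(t + k, n - 1)]   # replicate the last frame at the edge
            behind = track[max(t - k, 0)]      # replicate the first frame at the edge
            num += k * (ahead - behind)
        deltas.append(num / denom)
    return deltas


# Hypothetical log-F0 contour (made-up values, not data from the paper).
contour = [4.60, 4.65, 4.72, 4.80, 4.78, 4.70, 4.62]

# Two speakers producing the same contour shape; speaker B has a constant
# offset in log-F0, standing in for a speaker-dependent bias.
speaker_a = contour
speaker_b = [value + 0.69 for value in contour]

delta_a = regression_coefficients(speaker_a)
delta_b = regression_coefficients(speaker_b)

# The static features differ by the offset, but the regression
# coefficients agree up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(delta_a, delta_b))
```

Each term in the numerator is a difference of two frames, so any constant added to the whole track cancels. This is one way to read the abstract's statement that the regression procedure reduces the bias component of F0, power, and LPC residual power.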

Similar resources

Phonetic Speaker Id

This paper describes the exploration of text-independent speaker identification using novel approaches based on speakers’ phonetic features instead of traditional acoustic features. Different phonetic speaker identification approaches are discussed in this paper and evaluated using two speaker identification systems: one multilingual system and one single language multiple-engine system. Furthe...

Improved Gender Independent Speaker Recognition Using Convolutional Neural Network Based Bottleneck Features

This paper proposes a novel framework to improve performance of gender independent i-Vector PLDA based speaker recognition using convolutional neural network (CNN). Convolutional layers of a CNN offer robustness to variations in input features including those due to gender. A CNN is trained for ASR with a linear bottleneck layer. Bottleneck features extracted using the CNN are then used to trai...

Dissertation Summary Recognizing Non-native Speech: Characterizing and Adapting to Non-native Usage in LVCSR

Low-proficiency non-native speakers represent a significant challenge for large-vocabulary continuous speech recognition (LVCSR). Acoustic models are confused by a heavy accent; language models are confused by poor grammar and unconventional word choice. Lack of comfort with the spoken language affects the fundamental properties of connected speech that have been a focus of LVCSR research; cross-word and inte...

Publication date: 2002